153 research outputs found

    Expert-Guided Subgroup Discovery: Methodology and Application

    This paper presents an approach to expert-guided subgroup discovery. The main step of the subgroup discovery process, the induction of subgroup descriptions, is performed by a heuristic beam search algorithm, using a novel parametrized definition of rule quality which is analyzed in detail. The other important steps of the proposed subgroup discovery process are the detection of statistically significant properties of selected subgroups and subgroup visualization: statistically significant properties are used to enrich the descriptions of induced subgroups, while the visualization shows subgroup properties in the form of distributions of the numbers of examples in the subgroups. The approach is illustrated by the results obtained for a medical problem of early detection of patient risk groups.
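
    The beam search over subgroup descriptions can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: the abstract does not state the quality definition, so the form TP / (FP + g) with a generalization parameter g is an assumption here, and the function names and the toy patient records are hypothetical.

```python
def covers(rule, example):
    """A rule is a tuple of (attribute, value) conditions, all of which must hold."""
    return all(example[attr] == val for attr, val in rule)


def score(rule, examples, target, g):
    """Assumed parametrized quality TP / (FP + g); not taken from the abstract."""
    tp = sum(1 for ex in examples if covers(rule, ex) and ex["label"] == target)
    fp = sum(1 for ex in examples if covers(rule, ex) and ex["label"] != target)
    return tp / (fp + g)


def beam_search(examples, target, g=1.0, beam_width=3, max_len=2):
    """Return the beam_width best rules found, best first."""
    conditions = sorted(
        {(a, v) for ex in examples for a, v in ex.items() if a != "label"}
    )

    def rank(rules):
        # Sort by descending quality, breaking ties by the rule itself.
        return sorted(rules, key=lambda r: (-score(r, examples, target, g), r))

    beam, best = [()], []
    for _ in range(max_len):
        candidates = set()
        for rule in beam:
            used = {a for a, _ in rule}
            for cond in conditions:
                if cond[0] not in used:  # at most one condition per attribute
                    candidates.add(tuple(sorted(rule + (cond,))))
        beam = rank(candidates)[:beam_width]
        best = rank(set(best) | set(beam))[:beam_width]
    return best


# Hypothetical toy data, invented for illustration only.
examples = [
    {"smoker": "yes", "age": "old", "label": "risk"},
    {"smoker": "yes", "age": "old", "label": "risk"},
    {"smoker": "yes", "age": "young", "label": "risk"},
    {"smoker": "yes", "age": "young", "label": "ok"},
    {"smoker": "no", "age": "old", "label": "ok"},
    {"smoker": "no", "age": "young", "label": "ok"},
]

specific = beam_search(examples, "risk", g=1.0)[0]
general = beam_search(examples, "risk", g=100.0)[0]
```

    In this sketch a small g favours the precise two-condition rule (old smokers), while a large g favours the more general one-condition rule (smokers), which is the kind of trade-off an expert could steer through the parameter.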

    A Methodology for Mining Document-Enriched Heterogeneous Information Networks


    A decision support tool for health service re-design

    Many outpatient services are currently available only in hospitals, but there are plans to provide some of them alongside General Practitioners. Consequently, General Practitioners could soon be based at polyclinics. These changes have raised a number of concerns for Hounslow Primary Care Trust (PCT). For example, which outpatient services should be shifted from the hospital to the polyclinic? What are the current and expected future demands for these services? To tackle some of these concerns, the first phase of this project explores the set of specialties that are frequently visited in sequence (using sequential association rules). The second phase develops an Excel-based spreadsheet tool to compute the current and expected future demands for the selected specialties. The sequential association rule algorithm found endocrinology and ophthalmology to be highly associated (i.e. frequently visited in sequence), which means that these two specialties could easily be shifted from the hospital environment to the polyclinic. We illustrate the Excel-based spreadsheet tool for endocrinology and ophthalmology; however, the model is generic enough to cope with other specialties, provided that the data are available.
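
    To make the first phase concrete, the sketch below mines ordered pairs of specialties from per-patient visit sequences and reports their support and confidence. It is a simplified illustration under stated assumptions: the abstract does not name the algorithm used, the function name and thresholds are hypothetical, and the visit data (including cardiology and dermatology) are invented.

```python
from collections import Counter
from itertools import combinations


def sequential_rules(sequences, min_support=0.5):
    """Map (a, b) -> (support, confidence) for 'a is visited before b'.

    support    = share of patients whose sequence contains a before b
    confidence = that count divided by the number of patients who visited a
    """
    n = len(sequences)
    item_count = Counter()
    pair_count = Counter()
    for seq in sequences:
        item_count.update(set(seq))
        # combinations() keeps positional order, so a always precedes b.
        # The set ensures each pair is counted once per patient.
        pair_count.update({(a, b) for a, b in combinations(seq, 2) if a != b})
    rules = {}
    for (a, b), count in pair_count.items():
        support = count / n
        if support >= min_support:
            rules[(a, b)] = (support, count / item_count[a])
    return rules


# Hypothetical visit sequences, one list per patient.
visits = [
    ["endocrinology", "ophthalmology"],
    ["endocrinology", "ophthalmology", "cardiology"],
    ["cardiology", "endocrinology", "ophthalmology"],
    ["dermatology"],
]

rules = sequential_rules(visits, min_support=0.5)
```

    On this toy input only the endocrinology-then-ophthalmology pair passes the support threshold, which mirrors the kind of association the abstract reports.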

    A hybrid, auto-adaptive, and rule-based multi-agent approach using evolutionary algorithms for improved searching

    Selecting the most appropriate heuristic for solving a specific problem is not easy, for many reasons. This article focuses on one of these reasons: traditionally, the solution search process has operated in a given manner regardless of the specific problem being solved, and the process has been the same regardless of the size, complexity and domain of the problem. To cope with this situation, search processes should mould the search into areas of the search space that are meaningful for the problem. This article builds on previous work in the development of a multi-agent paradigm using techniques derived from knowledge discovery (data-mining techniques) on databases of so-far visited solutions. The aim is to improve the search mechanisms, increase computational efficiency and use rules to enrich the formulation of optimization problems, while reducing the search space and catering to realistic problems. Izquierdo Sebastián, J.; Montalvo Arango, I.; Campbell, E.; Pérez García, R. (2015). A hybrid, auto-adaptive, and rule-based multi-agent approach using evolutionary algorithms for improved searching. Engineering Optimization. 1-13. doi:10.1080/0305215X.2015.1107434

    Lamb meat quality assessment by support vector machines

    The correct assessment of meat quality (i.e., to fulfill the consumer's needs) is a crucial element within the meat industry. Although there are several factors that affect the perception of taste, tenderness is considered the most important characteristic. In this paper, a Feature Selection procedure, based on a Sensitivity Analysis, is combined with a Support Vector Machine in order to predict lamb meat tenderness. This real-world problem is defined in terms of two difficult regression tasks, by modeling objective (e.g. Warner-Bratzler Shear force) and subjective (e.g. human taste panel) measurements. In both cases, the proposed solution is competitive when compared with other neural (e.g. Multilayer Perceptron) and Multiple Regression approaches.
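
    One common way to rank inputs by sensitivity analysis is sketched below: each feature is swept across its observed range while the others are held at their means, and the variance of the model's response is taken as that feature's importance. This is an assumption about the general technique, not the paper's exact procedure; the `predict` argument stands in for any fitted regressor (such as a Support Vector Machine), and the linear stand-in model and data are hypothetical.

```python
from statistics import mean, pvariance


def sensitivity(predict, X, levels=5):
    """Relative importance of each feature, summing to 1.

    predict: callable taking one feature vector and returning a number.
    X:       list of feature vectors used to derive ranges and means.
    """
    n_features = len(X[0])
    baseline = [mean(row[j] for row in X) for j in range(n_features)]
    variances = []
    for j in range(n_features):
        lo = min(row[j] for row in X)
        hi = max(row[j] for row in X)
        outputs = []
        for k in range(levels):
            probe = list(baseline)
            probe[j] = lo + (hi - lo) * k / (levels - 1)  # sweep feature j only
            outputs.append(predict(probe))
        variances.append(pvariance(outputs))
    total = sum(variances)
    return [v / total if total else 0.0 for v in variances]


def toy_model(x):
    """Hypothetical stand-in for a trained regressor; feature 1 is irrelevant."""
    return 3 * x[0] + 0 * x[1] + 1 * x[2]


X = [[0, 0, 0], [1, 1, 1], [2, 2, 2]]
importance = sensitivity(toy_model, X)

# Feature indices from most to least important; low-ranked ones are
# candidates for removal in a feature selection loop.
ranking = sorted(range(len(importance)), key=lambda j: -importance[j])
```

    A selection procedure would then retrain the model without the lowest-ranked features and keep the reduced set if the prediction error does not degrade.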

    Discovering a taste for the unusual: exceptional models for preference mining

    Exceptional preferences mining (EPM) is a crossover between two subfields of data mining: local pattern mining and preference learning. EPM can be seen as a local pattern mining task that finds subsets of observations where some preference relations between labels significantly deviate from the norm. It is a variant of subgroup discovery, with rankings of labels as the target concept. We employ several quality measures that highlight subgroups featuring exceptional preferences, where the focus of what constitutes 'exceptional' varies with the quality measure: two measures look for exceptional overall ranking behavior, one measure indicates whether a particular label stands out from the rest, and a fourth measure highlights subgroups with unusual pairwise label ranking behavior. We explore a few datasets and compare with existing techniques. The results confirm that the new task EPM can deliver interesting knowledge. This research has received funding from the ECSEL Joint Undertaking, the framework programme for research and innovation Horizon 2020 (2014-2020) under Grant Agreement Number 662189-MANTIS-2014-1.

    Interactive decision support in hepatic surgery

    BACKGROUND: Hepatic surgery is characterized by complicated operations with a significant peri- and postoperative risk for the patient. We developed a web-based, high-granular research database for comprehensive documentation of all relevant variables to evaluate new surgical techniques. METHODS: To integrate this research system into the clinical setting, we designed an interactive decision support component. The objective is to provide relevant information for the surgeon and the patient to assess preoperatively the risk of a specific surgical procedure. Based on five established predictors of patient outcomes, the risk assessment tool searches for similar cases in the database and aggregates the information to estimate the risk for an individual patient. RESULTS: The physician can verify the analysis and manually exclude non-matching cases according to his or her expertise. The analysis is visualized by means of a Kaplan-Meier plot. To evaluate the decision support component we analyzed data on 165 patients diagnosed with hepatocellular carcinoma (period 1996–2000). The similarity search provides a two-peak distribution, indicating that there are groups of similar patients and singular cases which are quite different from the average. The results of the risk estimation are consistent with the observed survival data, but must be interpreted with caution because of the limited number of matching reference cases. CONCLUSION: Critical issues for the decision support system are clinical integration, a transparent and reliable knowledge base and user feedback.
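
    The two steps described above, retrieving similar cases and summarizing their outcomes, can be sketched as follows. This is an illustration only: the abstract does not name the five predictors or the similarity measure, so the predictor names, the match-count rule and the case records below are all hypothetical. The survival function is the standard Kaplan-Meier product-limit estimator.

```python
def similar_cases(patient, cases, predictors, min_matches):
    """Keep cases agreeing with the patient on at least min_matches predictors."""
    return [
        c for c in cases
        if sum(c[p] == patient[p] for p in predictors) >= min_matches
    ]


def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the outcome was observed at durations[i], 0 if censored.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:  # group ties at time t
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # both events and censored cases leave the risk set
    return curve


# Hypothetical reference cases; "months" is follow-up time.
cases = [
    {"stage": "I", "cirrhosis": True, "months": 2, "died": 1},
    {"stage": "I", "cirrhosis": True, "months": 3, "died": 1},
    {"stage": "I", "cirrhosis": True, "months": 3, "died": 0},
    {"stage": "I", "cirrhosis": True, "months": 5, "died": 1},
    {"stage": "I", "cirrhosis": True, "months": 8, "died": 0},
    {"stage": "III", "cirrhosis": False, "months": 1, "died": 1},
]
patient = {"stage": "I", "cirrhosis": True}

matches = similar_cases(patient, cases, ["stage", "cirrhosis"], min_matches=2)
curve = kaplan_meier([c["months"] for c in matches], [c["died"] for c in matches])
```

    With only five matching cases each step of the curve is large, which illustrates why the abstract cautions about estimates based on a limited number of reference cases.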

    Case-oriented computer-based-training in radiology: concept, implementation and evaluation

    BACKGROUND: Providing high-quality clinical cases is important for teaching radiology. We developed, implemented and evaluated a program for a university hospital to support this task. METHODS: The system was built with Intranet technology and connected to the Picture Archiving and Communications System (PACS). It contains cases for every user group from students to attendants and is structured according to the ACR-code (American College of Radiology) [2]. Each department member was given an individual account, could gather his or her teaching cases and put the completed cases into the common database. RESULTS: Over 18 months, 583 cases containing 4136 images involving all radiological techniques were compiled, and 350 cases were put into the common case repository. Workflow integration as well as individual interest influenced the personal efforts to participate, but an increasing number of cases and minor modifications of the program improved user acceptance continuously. 101 students went through an evaluation which showed a high level of acceptance and a special interest in elaborate documentation. CONCLUSION: Electronic access to reference cases for all department members anytime, anywhere is feasible. Critical success factors are workflow integration, reliability, efficient retrieval strategies and incentives for case authoring.